Overview of Results of the MUC-6 Evaluation
Abstract
The latest in a series of natural language processing system evaluations was concluded in October 1995 and was the topic of the Sixth Message Understanding Conference (MUC-6) in November. Participants were invited to enter their systems in as many as four different task-oriented evaluations. The Named Entity and Coreference tasks entailed Standard Generalized Markup Language (SGML) annotation of texts and were being conducted for the first time. The other two tasks, Template Element and Scenario Template, were information extraction tasks that followed on from the MUC evaluations conducted in previous years. The evolution and design of the MUC-6 evaluation are discussed in the paper by Grishman and Sundheim in this volume. All except the Scenario Template task are defined independently of any particular domain.
Similar Papers
The Multilingual Entity Task (MET) Overview (Roberta Merchant)
In November, 1996, the Message Understanding Conference-6 (MUC-6) evaluation of named entity identification demonstrated that systems are approaching human performance on English language texts [10]. Informal and anonymous, the MET provided a new opportunity to assess progress on the same task in Spanish, Japanese, and Chinese. Preliminary results indicate that MET systems in all three language...
Statistical Significance of MUC-6 Results
The results of the MUC-6 evaluation must be analyzed to determine whether close scores significantly distinguish systems or whether the differences in those scores are a matter of chance. In order to do such an analysis, a method of computer intensive hypothesis testing was developed by SAIC for the MUC-3 results and has been used for distinguishing MUC scores since that time. The implement...
Comparing MUCK-II and MUC-3: Assessing the Difficulty of Different Tasks
The natural language community has made impressive progress in evaluation over the last four years. However, as the evaluations become more sophisticated and more ambitious, a fundamental problem emerges: how to compare results across changing evaluation paradigms. When we change domain, task, and scoring procedures, as has been the case from MUCK-I to MUCK-II to MUC-3, we lose comparability of...
Overview of the Fourth Message Understanding Evaluation and Conference
The Fourth Message Understanding Conference (MUC-4) is the latest in a series of conferences that concern the evaluation of natural language processing (NLP) systems. These conferences have reported on progress being made both in the development of systems capable of analyzing relatively short English texts and in the definition of a rigorous performance evaluation methodology. MUC-4 was pr...